Measuring the Effect of Conversational Aspects on Machine Translation Quality

نویسندگان

  • Marlies van der Wees
  • Arianna Bisazza
  • Christof Monz
چکیده

Research in statistical machine translation (SMT) is largely driven by formal translation tasks, while translating informal text is much more challenging. In this paper we focus on SMT for the informal genre of dialogues, which has rarely been addressed to date. Concretely, we investigate the effect of dialogue acts, speakers, gender, and text register on SMT quality when translating fictional dialogues. We first create and release a corpus of multilingual movie dialogues annotated with these four dialogue-specific aspects. When measuring translation performance for each of these variables, we find that BLEU fluctuations between their categories are often significantly larger than randomly expected. Following this finding, we hypothesize and show that SMT of fictional dialogues benefits from adaptation towards dialogue acts and registers. Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploring the Translator\'s Solutions to the Translation of Conversational Implicatures from English into Persian: the Case of Tolkien\'s the Lord of the Rings

The present study aimed to examine the translatorchr('39')s solutions to the translation of conversational implicatures from English into Persian. To do so, 120 conversational implicatures were extracted from the novel the Lord of the Rings (Tolkien, 1954) and classified based on Gricechr('39')s (1975) categorization of Maxims, including quality, quantity, relevance, and manner. Mur Duenaschr('...

متن کامل

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

متن کامل

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

متن کامل

The Effect of Genre Awareness on English Translation Quality and Pedagogy: A Case of News Reports Translation as an Academic Curriculum

To produce an adequate translation, language students are required to learn varieties of language features including syntax, semantics and pragmatics. Considering the curriculum language learners are face with, one can claim that almost all language students in Iran are taught these features in their academic settings including linguistic courses. Yet, there are some aspects of language which a...

متن کامل

Improving Spoken Language Translation by Automatic Disfluency Removal : Evidence from Conversational Speech Transcripts

Machine translation of spoken language has made significant progress in recent years, however, translation quality is still limited due to specific idiosyncrasies of spoken language; including the lack of well-formed sentences and the presence of disfluencies. In this paper, we investigate the effect of disfluencies on Statistical Machine Translation (SMT) and introduce an Automatic Disfluency ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016